A Keyword Extraction Method by Document Expansion
Authors
Abstract
Similar resources
Automatic Keyword Extraction from Historical Document Images
This paper presents an automatic keyword extraction method from historical document images. The proposed method is language independent because it is purely appearance based, where neither lexical information nor any other statistical language models are required. Moreover, since it does not need word segmentation, it can be applied to Eastern languages where they do not put clear spacing betwe...
Term Weighting in Short Documents for Document Categorization, Keyword Extraction and Query Expansion
This thesis focuses on term weighting in short documents. I propose weighting approaches for assessing the importance of terms for three tasks: (1) document categorization, which aims to classify documents such as tweets into categories, (2) keyword extraction, which aims to identify and extract the most important words of a document, and (3) keyword association modeling, which aims to identify...
A Document Content Extraction Model Using Keyword Correlation Analysis
Owing to the rapid development of information and Internet technologies, large amounts of information and documents can be easily accessed through electronic networks. Beyond the efficiency of document acquisition, another typical issue in document management is document content extraction. In order to provide the critical contents of a document to the knowledge requester, ...
Keyword Extraction from a Single Document Using Centrality Measures
Keywords characterize the topics discussed in a document. Extracting a small set of keywords from a single document is an important problem in text mining. We propose a hybrid structural and statistical approach to extract keywords. We represent the given document as an undirected graph, whose vertices are words in the document and the edges are labeled with a dissimilarity measure between two ...
A probabilistic method for keyword retrieval in handwritten document images
Keyword retrieval in handwritten document images (word spotting) is very challenging given that OCR accuracy is not yet adequate for handwritten scripts, especially with large lexicons. Various proposed approaches build indices on information such as image features or OCR scores and have improved on the performance of the traditional approach, which builds an index on OCR'ed text. In this paper, we impr...
Journal
Journal title: Journal of Natural Language Processing
Year: 2007
ISSN: 1340-7619, 2185-8314
DOI: 10.5715/jnlp.14.67